<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Performance considerations</title>
<link rel="stylesheet" href="../../../../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../../index.html" title="Boost.Optional">
<link rel="up" href="../../optional/tutorial.html" title="Tutorial">
<link rel="prev" href="type_requirements.html" title="Type requirements">
<link rel="next" href="../../optional/reference.html" title="Reference">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
<td align="center"><a href="../../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="type_requirements.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../../optional/tutorial.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../../optional/reference.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_optional.tutorial.performance_considerations"></a><a class="link" href="performance_considerations.html" title="Performance considerations">Performance
      considerations</a>
</h3></div></div></div>
<p>
        Technical details aside, the memory layout of <code class="computeroutput"><span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">T</span><span class="special">&gt;</span></code>
        for a generic <code class="computeroutput"><span class="identifier">T</span></code> is more-less
        this:
      </p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">T</span><span class="special">&gt;</span>
<span class="keyword">class</span> <span class="identifier">optional</span>
<span class="special">{</span>
  <span class="keyword">bool</span> <span class="identifier">_initialized</span><span class="special">;</span>
  <span class="identifier">std</span><span class="special">::</span><span class="identifier">aligned_storage_t</span><span class="special">&lt;</span><span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">t</span><span class="special">),</span> <span class="keyword">alignof</span><span class="special">(</span><span class="identifier">T</span><span class="special">)&gt;</span> <span class="identifier">_storage</span><span class="special">;</span>
<span class="special">};</span>
</pre>
<p>
        Lifetime of the <code class="computeroutput"><span class="identifier">T</span></code> inside
        <code class="computeroutput"><span class="identifier">_storage</span></code> is manually controlled
        with placement-<code class="computeroutput"><span class="keyword">new</span></code>s and pseudo-destructor
        calls. However, for scalar <code class="computeroutput"><span class="identifier">T</span></code>s
        we use a different way of storage, by simply holding a <code class="computeroutput"><span class="identifier">T</span></code>:
      </p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">T</span><span class="special">&gt;</span>
<span class="keyword">class</span> <span class="identifier">optional</span>
<span class="special">{</span>
  <span class="keyword">bool</span> <span class="identifier">_initialized</span><span class="special">;</span>
  <span class="identifier">T</span> <span class="identifier">_storage</span><span class="special">;</span>
<span class="special">};</span>
</pre>
<p>
        We call it a <span class="emphasis"><em>direct</em></span> storage. This makes <code class="computeroutput"><span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">T</span><span class="special">&gt;</span></code> a
        trivially-copyable type for scalar <code class="computeroutput"><span class="identifier">T</span></code>s.
        This only works for compilers that support defaulted functions (including
        defaulted move assignment and constructor). On compilers without defaulted
        functions we still use the direct storage, but <code class="computeroutput"><span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">T</span><span class="special">&gt;</span></code>
        is no longer recognized as trivially-copyable. Apart from scalar types, we
        leave the programmer a way of customizing her type, so that it is reconized
        by <code class="computeroutput"><span class="identifier">optional</span></code> as candidate
        for optimized storage, by specializing type trait <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">opitonal_config</span><span class="special">::</span><span class="identifier">optional_uses_direct_storage_for</span></code>:
      </p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">X</span> <span class="comment">// not trivial</span>
<span class="special">{</span>
  <span class="identifier">X</span><span class="special">()</span> <span class="special">{}</span>
<span class="special">};</span>

<span class="keyword">namespace</span> <span class="identifier">boost</span> <span class="special">{</span> <span class="keyword">namespace</span> <span class="identifier">optional_config</span> <span class="special">{</span>

  <span class="keyword">template</span> <span class="special">&lt;&gt;</span> <span class="keyword">struct</span> <span class="identifier">optional_uses_direct_storage_for</span><span class="special">&lt;</span><span class="identifier">X</span><span class="special">&gt;</span> <span class="special">:</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">true_type</span> <span class="special">{};</span>

<span class="special">}}</span>
</pre>
<h5>
<a name="boost_optional.tutorial.performance_considerations.h0"></a>
        <span class="phrase"><a name="boost_optional.tutorial.performance_considerations.controlling_the_size"></a></span><a class="link" href="performance_considerations.html#boost_optional.tutorial.performance_considerations.controlling_the_size">Controlling
        the size</a>
      </h5>
<p>
        For the purpose of the following analysis, considering memory layouts, we
        can think of it as:
      </p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">T</span><span class="special">&gt;</span>
<span class="keyword">class</span> <span class="identifier">optional</span>
<span class="special">{</span>
  <span class="keyword">bool</span> <span class="identifier">_initialized</span><span class="special">;</span>
  <span class="identifier">T</span> <span class="identifier">_storage</span><span class="special">;</span>
<span class="special">};</span>
</pre>
<p>
        Given type <code class="computeroutput"><span class="identifier">optional</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span></code>, and
        assuming that <code class="computeroutput"><span class="keyword">sizeof</span><span class="special">(</span><span class="keyword">int</span><span class="special">)</span> <span class="special">==</span>
        <span class="number">4</span></code>, we will get <code class="computeroutput"><span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">optional</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;)</span>
        <span class="special">==</span> <span class="number">8</span></code>.
        This is so because of the alignment rules, for our two members we get the
        following alignment:
      </p>
<p>
        <span class="inlinemediaobject"><img src="../../images/opt_align1.png" alt="opt_align1"></span>
      </p>
<p>
        This means you can fit twice as many <code class="computeroutput"><span class="keyword">int</span></code>s
        as <code class="computeroutput"><span class="identifier">optional</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span></code>s into
        the same space of memory. Therefore, if the size of the objects is critical
        for your application (e.g., because you want to utilize your CPU cache in
        order to gain performance) and you have determined you are willing to trade
        the code clarity, it is recommended that you simply go with type <code class="computeroutput"><span class="keyword">int</span></code> and use some 'magic value' to represent
        <span class="emphasis"><em>not-an-int</em></span>, or use something like <a href="https://github.com/akrzemi1/markable" target="_top"><code class="computeroutput"><span class="identifier">markable</span></code></a> library.
      </p>
<p>
        Even if you cannot spare any value of <code class="computeroutput"><span class="keyword">int</span></code>
        to represent <span class="emphasis"><em>not-an-int</em></span> (e.g., because every value is
        useful, or you do want to signal <span class="emphasis"><em>not-an-int</em></span> explicitly),
        at least for <code class="computeroutput"><span class="identifier">Trivial</span></code> types
        you should consider storing the value and the <code class="computeroutput"><span class="keyword">bool</span></code>
        flag representing the <span class="emphasis"><em>null-state</em></span> separately. Consider
        the following class:
      </p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">Record</span>
<span class="special">{</span>
  <span class="identifier">optional</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">_min</span><span class="special">;</span>
  <span class="identifier">optional</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">_max</span><span class="special">;</span>
<span class="special">};</span>
</pre>
<p>
        Its memory layout can be depicted as follows:
      </p>
<p>
        <span class="inlinemediaobject"><img src="../../images/opt_align2.png" alt="opt_align2"></span>
      </p>
<p>
        This is exactly the same as if we had the following members:
      </p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">Record</span>
<span class="special">{</span>
  <span class="keyword">bool</span> <span class="identifier">_has_min</span><span class="special">;</span>
  <span class="keyword">int</span>  <span class="identifier">_min</span><span class="special">;</span>
  <span class="keyword">bool</span> <span class="identifier">_has_max</span><span class="special">;</span>
  <span class="keyword">int</span>  <span class="identifier">_max</span><span class="special">;</span>
<span class="special">};</span>
</pre>
<p>
        But when they are stored separately, we at least have an option to reorder
        them like this:
      </p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">Record</span>
<span class="special">{</span>
  <span class="keyword">bool</span> <span class="identifier">_has_min</span><span class="special">;</span>
  <span class="keyword">bool</span> <span class="identifier">_has_max</span><span class="special">;</span>
  <span class="keyword">int</span>  <span class="identifier">_min</span><span class="special">;</span>
  <span class="keyword">int</span>  <span class="identifier">_max</span><span class="special">;</span>
<span class="special">};</span>
</pre>
<p>
        Which gives us the following layout (and smaller total size):
      </p>
<p>
        <span class="inlinemediaobject"><img src="../../images/opt_align3.png" alt="opt_align3"></span>
      </p>
<p>
        Sometimes it requires detailed consideration what data we make optional.
        In our case above, if we determine that both minimum and maximum value can
        be provided or not provided together, but one is never provided without the
        other, we can make only one optional memebr:
      </p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">Limits</span>
<span class="special">{</span>
  <span class="keyword">int</span>  <span class="identifier">_min</span><span class="special">;</span>
  <span class="keyword">int</span>  <span class="identifier">_max</span><span class="special">;</span>
<span class="special">};</span>

<span class="keyword">struct</span> <span class="identifier">Record</span>
<span class="special">{</span>
  <span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">Limits</span><span class="special">&gt;</span> <span class="identifier">_limits</span><span class="special">;</span>
<span class="special">};</span>
</pre>
<p>
        This would give us the following layout:
      </p>
<p>
        <span class="inlinemediaobject"><img src="../../images/opt_align4.png" alt="opt_align4"></span>
      </p>
<h5>
<a name="boost_optional.tutorial.performance_considerations.h1"></a>
        <span class="phrase"><a name="boost_optional.tutorial.performance_considerations.optional_function_parameters"></a></span><a class="link" href="performance_considerations.html#boost_optional.tutorial.performance_considerations.optional_function_parameters">Optional
        function parameters</a>
      </h5>
<p>
        Having function parameters of type <code class="computeroutput"><span class="keyword">const</span>
        <span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">T</span><span class="special">&gt;&amp;</span></code>
        may incur certain unexpected run-time cost connected to copy construction
        of <code class="computeroutput"><span class="identifier">T</span></code>. Consider the following
        code.
      </p>
<pre class="programlisting"><span class="keyword">void</span> <span class="identifier">fun</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">Big</span><span class="special">&gt;&amp;</span> <span class="identifier">v</span><span class="special">)</span>
<span class="special">{</span>
  <span class="keyword">if</span> <span class="special">(</span><span class="identifier">v</span><span class="special">)</span> <span class="identifier">doSomethingWith</span><span class="special">(*</span><span class="identifier">v</span><span class="special">);</span>
  <span class="keyword">else</span>   <span class="identifier">doSomethingElse</span><span class="special">();</span>
<span class="special">}</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
  <span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">Big</span><span class="special">&gt;</span> <span class="identifier">ov</span><span class="special">;</span>
  <span class="identifier">Big</span> <span class="identifier">v</span><span class="special">;</span>
  <span class="identifier">fun</span><span class="special">(</span><span class="identifier">none</span><span class="special">);</span>
  <span class="identifier">fun</span><span class="special">(</span><span class="identifier">ov</span><span class="special">);</span> <span class="comment">// no copy</span>
  <span class="identifier">fun</span><span class="special">(</span><span class="identifier">v</span><span class="special">);</span>  <span class="comment">// copy constructor of Big</span>
<span class="special">}</span>
</pre>
<p>
        No copy elision or move semantics can save us from copying type <code class="computeroutput"><span class="identifier">Big</span></code> here. Not that we need any copy, but
        this is how <code class="computeroutput"><span class="identifier">optional</span></code> works.
        In order to avoid copying in this case, one could provide second overload
        of <code class="computeroutput"><span class="identifier">fun</span></code>:
      </p>
<pre class="programlisting"><span class="keyword">void</span> <span class="identifier">fun</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">Big</span><span class="special">&amp;</span> <span class="identifier">v</span><span class="special">)</span>
<span class="special">{</span>
  <span class="identifier">doSomethingWith</span><span class="special">(</span><span class="identifier">v</span><span class="special">);</span>
<span class="special">}</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
  <span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">Big</span><span class="special">&gt;</span> <span class="identifier">ov</span><span class="special">;</span>
  <span class="identifier">Big</span> <span class="identifier">v</span><span class="special">;</span>
  <span class="identifier">fun</span><span class="special">(</span><span class="identifier">ov</span><span class="special">);</span> <span class="comment">// no copy</span>
  <span class="identifier">fun</span><span class="special">(</span><span class="identifier">v</span><span class="special">);</span>  <span class="comment">// no copy: second overload selected</span>
<span class="special">}</span>
</pre>
<p>
        Alternatively, you could consider using an optional reference instead:
      </p>
<pre class="programlisting"><span class="keyword">void</span> <span class="identifier">fun</span><span class="special">(</span><span class="identifier">optional</span><span class="special">&lt;</span><span class="keyword">const</span> <span class="identifier">Big</span><span class="special">&amp;&gt;</span> <span class="identifier">v</span><span class="special">)</span> <span class="comment">// note where the reference is</span>
<span class="special">{</span>
  <span class="keyword">if</span> <span class="special">(</span><span class="identifier">v</span><span class="special">)</span> <span class="identifier">doSomethingWith</span><span class="special">(*</span><span class="identifier">v</span><span class="special">);</span>
  <span class="keyword">else</span>   <span class="identifier">doSomethingElse</span><span class="special">();</span>
<span class="special">}</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
  <span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">Big</span><span class="special">&gt;</span> <span class="identifier">ov</span><span class="special">;</span>
  <span class="identifier">Big</span> <span class="identifier">v</span><span class="special">;</span>
  <span class="identifier">fun</span><span class="special">(</span><span class="identifier">none</span><span class="special">);</span>
  <span class="identifier">fun</span><span class="special">(</span><span class="identifier">ov</span><span class="special">);</span> <span class="comment">// doesn't compile</span>
  <span class="identifier">fun</span><span class="special">(</span><span class="identifier">v</span><span class="special">);</span>  <span class="comment">// no copy</span>
<span class="special">}</span>
</pre>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2003-2007 Fernando Luis Cacciola Carballal<br>Copyright © 2014-2021 Andrzej Krzemieński<p>
        Distributed under the Boost Software License, Version 1.0. (See accompanying
        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
      </p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="type_requirements.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../../optional/tutorial.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../../optional/reference.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>
